<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<TITLE>Chapter 3</TITLE>
<META name="Robots" content="NONE">
<link rel="stylesheet" href="./style.css">
</HEAD>
<BODY>
<H2>CHAPTER 3: Pointers and Strings </H2>
<P>
The study of strings is useful to further tie in the relationship
between pointers and arrays. It also makes it easy to illustrate
how some of the standard C string functions can be implemented.
Finally it illustrates how and when pointers can and should be
passed to functions.
<P>
In C, strings are arrays of characters. This is not necessarily
true in other languages. In BASIC, Pascal, Fortran and various
other languages, a string has its own data type. But in C it does
not. In C a string is an array of characters terminated with a
binary zero character (written as <B>'\0'</B>). To start off our
discussion we will write some code which, while preferred for
illustrative purposes, you would probably never write in an actual
program. Consider, for example:
<PRE><CODE>
    char my_string[40];

    my_string[0] = 'T';
    my_string[1] = 'e';
    my_string[2] = 'd';
    my_string[3] = '\0';
</CODE></PRE>
<P>
While one would never build a string like this, the end result
is a string in that it is an array of characters <B>terminated
with a nul character</B>. By definition, in C, a string is an
array of characters terminated with the nul character. Be aware
that &quot;nul&quot; is <B>not</B> the same as &quot;NULL&quot;.
The nul refers to a zero as defined by the escape sequence <B>'\0'</B>;
that is, it occupies one byte of memory. NULL, on the other hand,
is the name of the macro used to initialize null pointers. NULL
is #defined in a header file supplied with your C compiler; nul
may not be #defined at all.
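<P>
The difference can be sketched with two small helpers. The names
<B>is_nul()</B> and <B>is_null_pointer()</B> are made up for this
illustration and are not part of the standard library. One tests a
character, the other tests a pointer:

```c
#include <stddef.h>   /* NULL */

/* Returns 1 when c is the nul character that terminates a string. */
int is_nul(char c)
{
    return c == '\0';
}

/* Returns 1 when p is a null pointer, i.e. it points at nothing. */
int is_null_pointer(const char *p)
{
    return p == NULL;
}
```

<P>
An empty string such as <B>&quot;&quot;</B> shows both ideas at once:
a pointer to it is not a null pointer, yet the first character it
points to is the nul character.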
<P>
Since writing the above code would be very time consuming, C permits
two alternate ways of achieving the same thing. First, one might
write:
<PRE><CODE>
    char my_string[40] = {'T', 'e', 'd', '\0',};
</CODE></PRE>
<P>
But this also takes more typing than is convenient. So, C permits:
<PRE><CODE>
    char my_string[40] = &quot;Ted&quot;;
</CODE></PRE>
<P>
When the double quotes are used, instead of the single quotes
as was done in the previous examples, the nul character (<B>'\0'</B>)
is automatically appended to the end of the string.
<P>
In all of the above cases, the same thing happens. The compiler
sets aside a contiguous block of memory 40 bytes long to hold
characters and initializes it such that the first 4 characters
are <B>Ted\0</B>.
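<P>
One way to check this claim is to build the array all three ways and
compare the first 4 bytes. The following is only a sketch; the function
name <B>same_first_four()</B> is invented for the purpose:

```c
#include <string.h>   /* memcmp */

/* Returns 1 if all three forms leave 'T', 'e', 'd', '\0' in the
   first 4 bytes of the array. */
int same_first_four(void)
{
    char a[40];
    char b[40] = {'T', 'e', 'd', '\0'};
    char c[40] = "Ted";

    a[0] = 'T';
    a[1] = 'e';
    a[2] = 'd';
    a[3] = '\0';

    /* Compare only the 4 bytes that were written: the rest of a[]
       was never assigned, so its contents are indeterminate. */
    return memcmp(a, b, 4) == 0 && memcmp(b, c, 4) == 0;
}
```

<P>
Only the first 4 bytes are compared because the element-by-element
version, when written inside a function as here, leaves the other 36
bytes unassigned. The two initializer forms set those bytes to zero.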
<P>
Now, consider the following program:
<PRE><CODE>
------------------program 3.1-------------------------------------

/* Program 3.1 from PTRTUT10.HTM   6/13/97 */

#include &lt;stdio.h&gt;

char strA[80] = &quot;A string to be used for demonstration purposes&quot;;
char strB[80];

int main(void)
{

    char *pA;     /* a pointer to type character */
    char *pB;     /* another pointer to type character */
    puts(strA);   /* show string A */
    pA = strA;    /* point pA at string A */
    puts(pA);     /* show what pA is pointing to */
    pB = strB;    /* point pB at string B */
    putchar('\n');       /* move down one line on the screen */
    while(*pA != '\0')   /* line A (see text) */
    {
        *pB++ = *pA++;   /* line B (see text) */
    }
    *pB = '\0';          /* line C (see text) */
    puts(strB);          /* show strB on screen */
    return 0;
}

--------- end program 3.1 -------------------------------------


</CODE></PRE>
<P>
In the above we start out by defining two character arrays of
80 characters each. Since these are globally defined, they are
initialized to all <B>'\0'</B>s first. Then, <B>strA</B> has the
first 46 characters initialized to the string in quotes.
<P>
Now, moving into the code, we declare two character pointers and
show the string on the screen. We then &quot;point&quot; the pointer
<B>pA</B> at <B>strA</B>. That is, by means of the assignment
statement we copy the address of <B>strA[0]</B> into our variable
<B>pA</B>. We now use <B>puts()</B> to show that which is pointed
to by <B>pA</B> on the screen. Consider here that the function
prototype for <B>puts()</B> is:
<PRE><CODE>
    int puts(const char *s);
</CODE></PRE>
<P>
For the moment, ignore the <B>const</B>. The parameter passed
to <B>puts()</B> is a pointer, that is the <B>value</B> of a pointer
(since all parameters in C are passed by value), and the value
of a pointer is the address to which it points, or, simply, an
address. Thus when we write <B>puts(strA); </B>as we have seen,
we are passing the address of <B>strA[0]</B>.
<P>
Similarly, when we write <B>puts(pA);</B> we are passing the same
address, since we have set <B>pA = strA;</B>
<P>
Given that, follow the code down to the <B>while()</B> statement
on line A. Line A states:
<P>
While the character pointed to by <B>pA</B> (i.e. <B>*pA</B>)
is not a nul character (i.e. the terminating <B>'\0'</B>), do
the following:
<P>
Line B states: copy the character pointed to by <B>pA</B> to the
space pointed to by <B>pB</B>, then increment <B>pA</B> so it
points to the next character and <B>pB</B> so it points to the
next space.
<P>
When we have copied the last character, <B>pA</B> now points to
the terminating nul character and the loop ends. However, we have
not copied the nul character. And, by definition a string in C
<B>must</B> be nul terminated. So, we add the nul character with
line C.
<P>
It is very educational to run this program with your debugger
while watching <B>strA</B>, <B>strB</B>, <B>pA</B> and <B>pB</B>
and single stepping through the program. It is even more educational
if, instead of simply defining <B>strB[]</B> as has been done above,
you also initialize it with something like:
<PRE><CODE>
    char strB[80] = &quot;12345678901234567890123456789012345678901234567890&quot;;
</CODE></PRE>
<P>
where the number of digits used is greater than the length of
<B>strA</B> and then repeat the single stepping procedure while
watching the above variables. Give these things a try!
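<P>
If no debugger is at hand, the same experiment can be sketched in
code. The function name <B>copy_over_digits()</B> is invented here;
the loop inside it is the one from program 3.1:

```c
#include <stddef.h>   /* size_t */

char strA[80] = "A string to be used for demonstration purposes";
char strB[80] = "12345678901234567890123456789012345678901234567890";

/* Copies strA into strB with the same loop as program 3.1 and
   returns the number of characters copied, not counting the nul. */
size_t copy_over_digits(void)
{
    char *pA = strA;
    char *pB = strB;

    while (*pA != '\0')
    {
        *pB++ = *pA++;
    }
    *pB = '\0';     /* overwrites one more digit with the terminator */

    return (size_t)(pB - strB);
}
```

<P>
After the call, the digits that the copy did not reach are still
sitting in <B>strB</B> beyond the nul written by the last assignment.
<B>puts(strB)</B> never shows them because it stops at the first nul
it meets.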
<P>
Getting back to the prototype for <B>puts()</B> for a moment,
the &quot;const&quot; used as a parameter modifier informs the
user that the function will not modify the string pointed to by
<B>s</B>, i.e. it will treat that string as a constant.
<P>
Of course, what the above program illustrates is a simple way
of copying a string. After playing with the above until you have
a good understanding of what is happening, we can proceed to creating
our own replacement for the standard <B>strcpy()</B> that comes
with C. It might look like:
<PRE><CODE>
   char *my_strcpy(char *destination, char *source)
   {
       char *p = destination;
       while (*source != '\0')
       {
           *p++ = *source++;
       }
       *p = '\0';
       return destination;
   }
</CODE></PRE>
<P>
In this case, I have followed the practice used in the standard
routine of returning a pointer to the destination.
<P>
Again, the function is designed to accept the values of two character
pointers, i.e. addresses, and thus in the previous program we
could write:
<PRE><CODE>
    int main(void)
    {
        my_strcpy(strB, strA);
        puts(strB);
    }
</CODE></PRE>
<P>
I have deviated slightly from the form used in standard C which
would have the prototype:
<PRE><CODE>
    char *my_strcpy(char *destination, const char *source);
</CODE></PRE>
<P>
Here the &quot;const&quot; modifier is used to assure the user
that the function will not modify the contents pointed to by the
source pointer. You can prove this by modifying the function above,
and its prototype, to include the &quot;const&quot; modifier as
shown. Then, within the function you can add a statement which
attempts to change the contents of that which is pointed to by
source, such as:
<PRE><CODE>
    *source = 'X';
</CODE></PRE>
<P>
which would normally change the first character of the string
to an X. The const modifier should cause your compiler to catch
this as an error. Try it and see.
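<P>
A sketch of that experiment follows. The name <B>my_strcpy_const()</B>
is used only to keep it apart from the version above, and the offending
statement is commented out so that the function compiles as shown:

```c
/* Same loop as my_strcpy(), with the source declared const. */
char *my_strcpy_const(char *destination, const char *source)
{
    char *p = destination;

    /* *source = 'X'; */   /* uncomment to see the compiler object */

    while (*source != '\0')
    {
        *p++ = *source++;   /* reading through source is still allowed */
    }
    *p = '\0';
    return destination;
}
```

<P>
Note that <B>source++</B> is still legal. The <B>const</B> applies to
the characters pointed to, not to the pointer itself, which is only a
local copy of the address the caller passed.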
<P>
Now, let's consider some of the things the above examples have
shown us. First off, consider the fact that <B>*ptr++</B> is to
be interpreted as returning the value pointed to by <B>ptr</B>
and then incrementing the pointer value. This has to do with the
precedence of the operators: the postfix <B>++</B> binds more
tightly than the unary <B>*</B>, so the expression groups as
<B>*(ptr++)</B>. Were we to write <B>(*ptr)++</B>
we would increment, not the pointer, but that which the pointer
points to! That is, if used on the first character of the above example
string the 'T' would be incremented to a 'U'. You can write some
simple example code to illustrate this.
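<P>
Such example code might look like the following sketch. Both function
names are invented for the illustration:

```c
/* *ptr++ : stores the character read in *seen, then returns the
   pointer after it has moved on to the next character. */
char *step_pointer(char *ptr, char *seen)
{
    *seen = *ptr++;
    return ptr;
}

/* (*ptr)++ : the pointer stays where it is; the character it
   points to is incremented. */
char *bump_character(char *ptr)
{
    (*ptr)++;
    return ptr;
}
```

<P>
Called on an array holding <B>&quot;Ted&quot;</B>, <B>step_pointer()</B>
reports the 'T' and returns the address of the 'e', leaving the string
unchanged. <B>bump_character()</B> returns the address it was given and
leaves the array holding &quot;Ued&quot; on an ASCII system.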
<P>
Recall again that a string is nothing more than an array of characters,
with the last character being a <B>'\0'</B>. What we have done
above is deal with copying an array. It happens to be an array
of characters but the technique could be applied to an array of
integers, doubles, etc. In those cases, however, we would not
be dealing with strings and hence the end of the array would not
be marked with a special value like the nul character. We could
implement a version that relied on a special value to identify
the end. For example, we could copy an array of positive integers
by marking the end with a negative integer. On the other hand,
it is more usual that when we write a function to copy an array
of items other than strings we pass the function the number of
items to be copied as well as the address of the array, e.g. something
like the following prototype might indicate:
<PRE><CODE>
    void int_copy(int *ptrA, int *ptrB, int nbr);
</CODE></PRE>
<P>
where <B>nbr</B> is the number of integers to be copied. You might
want to play with this idea and create an array of integers and
see if you can write the function <B>int_copy()</B> and make it
work.
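<P>
For comparison once you have tried it yourself, here is one possible
sketch of both approaches. It assumes that <B>ptrA</B> is the source
and <B>ptrB</B> the destination, which the prototype alone does not
say. The second function, with its invented name
<B>int_copy_until_negative()</B>, shows the end-marker idea:

```c
/* Copies nbr integers from ptrA to ptrB.
   Assumption: ptrA is the source and ptrB the destination. */
void int_copy(int *ptrA, int *ptrB, int nbr)
{
    while (nbr-- > 0)
    {
        *ptrB++ = *ptrA++;
    }
}

/* End-marker variant: copies values until a negative integer is
   found, copies that marker too, and returns the number of values
   copied before the marker. */
int int_copy_until_negative(const int *source, int *destination)
{
    int count = 0;

    while (*source >= 0)
    {
        *destination++ = *source++;
        count++;
    }
    *destination = *source;   /* copy the negative end marker */
    return count;
}
```

<P>
In both cases the caller must make sure the destination array is large
enough, just as with <B>strcpy()</B>.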
<P>
This permits using functions to manipulate large arrays. For example,
if we have an array of 5000 integers that we want to manipulate
with a function, we need only pass to that function the address
of the array (and any auxiliary information such as nbr above,
depending on what we are doing). The array itself does <B>not</B>
get passed, i.e. the whole array is not copied and put on the
stack before calling the function, only its address is sent.
<P>
This is different from passing, say, an integer to a function.
When we pass an integer we make a copy of the integer, i.e. get
its value and put it on the stack. Within the function any manipulation
of the value passed can in no way affect the original integer.
But, with arrays and pointers we can pass the address of the variable
and hence manipulate the values of the original variables.
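<P>
The contrast can be sketched with three small functions, all with
made-up names:

```c
/* Receives a copy of the caller's integer; only the copy changes. */
int add_one_by_value(int n)
{
    n = n + 1;
    return n;
}

/* Receives the address of the caller's integer and changes the
   original through it. */
void add_one_by_address(int *p)
{
    *p = *p + 1;
}

/* Receives the address of the first element of an array; the
   caller's array itself is modified, no copy is made. */
void add_one_to_each(int *array, int nbr)
{
    int i;

    for (i = 0; i < nbr; i++)
    {
        array[i] = array[i] + 1;
    }
}
```

<P>
With <B>int x = 5;</B>, the call <B>add_one_by_value(x)</B> returns 6
but leaves <B>x</B> at 5, whereas <B>add_one_by_address(&amp;x)</B>
changes <B>x</B> itself to 6.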
<DL>
<DT><CENTER><A HREF=ch4x.htm>Chapter 4: More on Strings</A></CENTER>
<DT><CENTER><A Href=pointers.htm>Back to Table of Contents</A></CENTER>
<DT>
<DD>
</DL>
</BODY>
</HTML>
